UNIVERSALLY OPTIMAL CROSSOVER DESIGNS UNDER
SUBJECT DROPOUT

By Wei Zheng 

Indiana University-Purdue University Indianapolis 

Subject dropout is very common in practical applications of cross- 
over designs. However, there is very limited design literature taking 
this into account. Optimality results have not yet been well estab- 
lished due to the complexity of the problem. This paper establishes 
feasible, as well as necessary and sufficient conditions for a crossover 
design to be universally optimal in approximate design theory in the 
presence of subject dropout. These conditions are essentially linear 
equations with respect to proportions of all possible treatment se- 
quences being applied to subjects and hence they can be easily solved. 
A general algorithm is proposed to derive exact designs which are 
shown to be efficient and robust. 

1. Introduction. Crossover designs have been widely used in industry 
due to their cost effectiveness and statistical efficiency. They are applicable
for experiments aiming to compare effects of different treatments by applying 
them to a number of subjects across several periods. The response observa- 
tion is typically modeled by additive effects of subjects, periods, treatments 
and the carryover effects of the treatment from the previous period. There
is a large body of literature regarding the identification of optimal
designs. See Hedayat and Afsarinejad (1978), Cheng and Wu (1980),
Kunert (1984), Stufken (1991), Kushner (1997a, 1997b, 1998), Kunert and 
Martin (2000), Kunert and Stufken (2002), Hedayat and Yang (2003, 2004) 
and Hedayat and Zheng (2010), for instance. For comprehensive reviews, see 
Matthews (1988), Ratkowsky, Evans and Alldredge (1992), Stufken (1996), 
Jones and Kenward (2003), Senn (2003) and Bose and Dey (2009). 

An important issue regarding crossover designs is that subjects may drop
out of the study. As a result, the experiment will not be carried out as
planned. Matthews (1988) commented that this is one of the main concerns of
crossover designs. Low, Lewis and Prescott (1999) observed that "A dropout
rate of between 5% and 10% is not uncommon and, in some areas, can be 
as high as 25%." Meanwhile, a design which is optimal or highly efficient in
the absence of dropout can become inefficient or even disconnected in the
presence of subject dropout. Examples can be found in Godolphin (2004),
Majumdar, Dean and Lewis (2008) as well as Section 5 of this paper.

To conclude, it is very important to find optimal or efficient designs in 
the presence of subject dropout, yet there is very limited literature on this. 
Bose and Bagchi (2008) derived designs which are universally optimal for 
both direct and carryover effects for both the situation of no dropout and 
the situation that all subjects drop out after period q with q being judi- 
ciously chosen. Similar results are presented by Majumdar, Dean and Lewis 
(2008). The latter restricted the comparison of designs within the subclass of 
uniformly balanced repeated measurement designs (UBRMDs), whose opti- 
mality property has been well recognized in literature for the situation of no 
dropout. For the second situation with any given q, they proposed type W g 
UBRMDs, which reduce the maximum loss of the information for parameters
in terms of the A-criterion as compared to general UBRMDs. Following the
latter paper, Zhao and Majumdar (2012) further explored the special case
when q is one less than the number of periods and the numbers of treatments
and periods are the same.

The previous three papers share two drawbacks: (i) The proposed designs
exist only under very rare combinations of the numbers of subjects,
periods and treatments. See Section 5.1.2 for relevant discussions of the
former paper. As for the other two papers, it is well known that the existence
of UBRMDs is rare. (ii) The information regarding the mechanism of how
subjects drop out was not taken into account.

To address the latter drawback, it is plausible to measure the perfor- 
mance of designs by taking the expectation of a regular optimality criterion 
with respect to the mechanism of subject dropout. Low, Lewis and Prescott 
(1999) worked in this direction by using intensive computer programming. 
They concluded that when the Latin squares constituting the design are more
diverse, the resulting design performs better in terms of both efficiency and
robustness. This argument is further supported by the comparison in Section 5.
However, the case studies they provided do not give general guidance
in identifying efficient designs. To serve this purpose, theoretical results
are called for.

In this paper, we develop feasible equivalent conditions for a design to 
be universally optimal for direct treatment effects in approximate design 
theory under the same setup as that of Low, Lewis and Prescott (1999). The 
equivalence holds for any probability distribution of subject dropout. The 
results can be easily modified to find optimal or highly efficient exact designs 
for any combination of the numbers of subjects, periods and treatments. As 
a result, the two drawbacks are both addressed here. 






The rest of the paper is organized as follows. Section 2 formulates the 
problem, introduces notation and gives some preliminary results. Section 3 
introduces necessary concepts in approximate design theory, proves the ex- 
istence of universally optimal designs and also gives necessary, sufficient and 
equivalent conditions for universal optimality. Section 4 gives explicit and 
feasible forms of optimality conditions in terms of linear equations, which are 
built upon the preceding section. Section 5 further provides a general algo- 
rithm for deriving an optimal or efficient exact design for any combination 
of the numbers of subjects, periods and treatments as well as any prob- 
ability distribution of subject dropout. Besides, comparisons are made to 
designs in literature. Section 6 summarizes the results. Finally, some proofs 
are deferred to Section 7. 

2. Framework. This section introduces the framework of the problem. 
Section 2.1 introduces the statistical model for the design problem and pro- 
vides notation and assumptions necessary to the rest of the paper. Sec- 
tion 2.2 defines an ideal target function in finding a design, proposes a cor- 
responding surrogate target function, and discusses the relationship between 
these two target functions. Section 2.3 provides some preliminary results as 
a preparation for the rest of the paper. 

2.1. Modeling and notation. In a crossover design with $p$ periods, $t$ treatments
and $n$ subjects, the response is typically modeled as

(1) $Y_{dku} = \mu + \pi_k + \varsigma_u + \tau_{d(k,u)} + \gamma_{d(k-1,u)} + \varepsilon_{ku}$,

where $\{\varepsilon_{ku}, 1 \le k \le p, 1 \le u \le n\}$ are independent with mean zero and
variance $\sigma^2$. Here, $Y_{dku}$ denotes the response from subject $u$ in period $k$ to which
treatment $d(k,u) \in \{1,2,\ldots,t\}$ was assigned by design $d$. Furthermore, $\mu$ is
the general mean, $\pi_k$ is the $k$th period effect, $\varsigma_u$ is the $u$th subject effect,
$\tau_{d(k,u)}$ is the (direct) treatment effect of treatment $d(k,u)$ and $\gamma_{d(k-1,u)}$ is
the carryover effect of treatment $d(k-1,u)$ that subject $u$ received in the
previous period (by convention $\gamma_{d(0,u)} = 0$).

Let $G$ be a temporary object whose meaning differs from context to
context. Then we define $G'$ to represent the transpose of the matrix $G$,
$G^-$ to represent a generalized inverse of the matrix $G$, $\mathrm{tr}(G)$ to represent
the trace of the matrix $G$ and $\mathrm{pr}^\perp$ to be a projection operator such that
$\mathrm{pr}^\perp G = I - G(G'G)^-G'$. For two square matrices of equal size, $G_1$ and $G_2$,
$G_1 \le G_2$ means that $G_2 - G_1$ is nonnegative definite. For a set $\mathcal{G}$, the number
of elements in the set is represented by $|\mathcal{G}|$.

Besides, $I_k$ is the $k \times k$ identity matrix, $1_k$ is the vector of length $k$ with
all its entries as 1, and $J_k = 1_k 1_k'$ is the square matrix with all its entries as 1.
We further define $B_k = I_k - J_k/k$, $B^k_{i,j}$ to be the $i \times j$ matrix with its upper
left corner filled with the submatrix $B_k$ while the remaining entries are filled
with 0, and $B^k_i = B^k_{i,i}$. The notations $I^k_{i,j}$ and $I^k_i$ are defined in the same
fashion as $B^k_{i,j}$ and $B^k_i$. Finally, $\otimes$ represents the Kronecker product of two
matrices. To make the problem tractable, it is necessary to make two mild
assumptions as follows.

Assumption 1. Once a subject drops out of the study, the probability 
that the subject reenters the study is zero. 

By Assumption 1, we are able to define $l_i$, $1 \le i \le n$, to be the total
number of periods that subject $i$ stayed in the experiment. Further it is
realistic in a large number of applications to assume the following:

Assumption 2. The dropout mechanism is independent of the
choice of design $d$ as well as the outcome of the experiments. Moreover
$\{l_i, 1 \le i \le n\}$ are i.i.d.

By Assumption 2, we could define $a_k$ to be the probability that $l_i = k$,
$1 \le k \le p$, and hence we are in a position to define the following technical terms:

• $a = (a_1, a_2, \ldots, a_p)$.

• $a_{jk} = \sum_{i=j}^{k} a_i$, $1 \le j \le k \le p$. (Convention: $a_{p+1,p} = 0$.)

• $m = \min\{k : a_k > 0\}$.

• afc = n -i(( n+ i) afc+a «+^ i _ a ™+ 1 ) ) \<k<p. 

• /3k = a k + afc+i^a^ - a fcp a i k-v l<h<p. 

• $A = \sum_{k=1}^{p} \alpha_k B^k_p$.

• $B = \sum_{k=1}^{p} \beta_k B^k_p$.

Definition 1. An experiment is said to be complete if there is no 
dropout. 

By definition the complete experiment is a special case in our framework 
and has been extensively studied in literature. Here, we aim to investigate 
desirable designs for any given dropout mechanism a. 

Notice that A and B are both nonnegative definite matrices. Since (3k > 
a k + a k+i,p a ik ~ a fcp°ifc = a fc(l — ^ifc) > 0, we have B > 0. By the mean value 
theorem one could show that $\alpha_k \ge 0$ and hence $A \ge 0$. Note that $a_k = 0$
implies $\alpha_k = \beta_k = 0$. Hence we have $A = \sum_{k=m}^{p} \alpha_k B^k_p$ and $B = \sum_{k=m}^{p} \beta_k B^k_p$.
The same representation will be adopted in the sequel whenever the summation
over the period $k$ is involved. Finally, we should be aware of the
differences and relationships among the matrices $B_k$, $B_p$ and $B$.

2.2. Optimality criteria. Writing the $np \times 1$ response vector as $Y_d =
(Y_{d11}, Y_{d21}, \ldots, Y_{dp1}, Y_{d12}, \ldots, Y_{dpn})'$, model (1) can be written as

(2) $Y_d = 1_{np}\mu + Z\pi + U\varsigma + T_d\tau + F_d\gamma + \varepsilon$,

where $\pi = (\pi_1, \ldots, \pi_p)'$, $\varsigma = (\varsigma_1, \ldots, \varsigma_n)'$, $\tau = (\tau_1, \ldots, \tau_t)'$, $\gamma = (\gamma_1, \ldots, \gamma_t)'$,
$Z = 1_n \otimes I_p$, $U = I_n \otimes 1_p$ and $T_d$ and $F_d$ denote the treatment/subject and
carryover/subject incidence matrices. Here $E\varepsilon = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2 I_{np}$. For
design $d$ under a realization of experiment $l = (l_1, \ldots, l_n)'$, the information
matrix for the direct treatment effects $\tau$ under model (2) with $\sigma^2 = 1$ is

$C_d(\tau, l) = (MT_d)' \, \mathrm{pr}^\perp(MZ|MU|MF_d) \, (MT_d)$
$\qquad\quad\;\; = C_{d11}(l) - C_{d12}(l)[C_{d22}(l)]^- C_{d21}(l)$,

where

$C_{d11}(l) = T_d' O T_d$, $C_{d12}(l) = T_d' O F_d$,
$C_{d21}(l) = C_{d12}(l)'$, $C_{d22}(l) = F_d' O F_d$,
$M = \mathrm{diag}(I^{l_i}_p, i = 1, 2, \ldots, n)$,
$O = M' \, \mathrm{pr}^\perp(MZ|MU) \, M$.
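
As a rough numerical illustration of the two expressions for $C_d(\tau,l)$ above, the following sketch builds $T_d$, $F_d$, $Z$, $U$ and $M$ for a small design and compares the direct projection form with the Schur complement form. The 4 x 4 Latin square, the dropout vector and all function names are hypothetical choices made for this example, and the Moore-Penrose inverse is used as one convenient generalized inverse.

```python
import numpy as np

def incidence(design, t):
    """Build T_d and F_d; rows are ordered subject by subject, periods 1..p within subject."""
    p, n = design.shape
    T = np.zeros((n * p, t))
    F = np.zeros((n * p, t))
    for u in range(n):
        for k in range(p):
            T[u * p + k, design[k, u] - 1] = 1.0          # direct effect of d(k,u)
            if k > 0:
                F[u * p + k, design[k - 1, u] - 1] = 1.0  # carryover of d(k-1,u)
    return T, F

def proj_perp(G):
    """pr^perp G = I - G (G'G)^- G', with the Moore-Penrose inverse as g-inverse."""
    return np.eye(G.shape[0]) - G @ np.linalg.pinv(G.T @ G, rcond=1e-10) @ G.T

def info_matrix(design, t, l):
    """Return C_d(tau, l) computed two ways: direct projection and Schur complement."""
    p, n = design.shape
    T, F = incidence(design, t)
    Z = np.kron(np.ones((n, 1)), np.eye(p))   # Z = 1_n (x) I_p
    U = np.kron(np.eye(n), np.ones((p, 1)))   # U = I_n (x) 1_p
    keep = np.concatenate([(np.arange(p) < li).astype(float) for li in l])
    M = np.diag(keep)                         # M = diag(I_p^{l_i})
    MT = M @ T
    direct = MT.T @ proj_perp(np.hstack([M @ Z, M @ U, M @ F])) @ MT
    O = M.T @ proj_perp(np.hstack([M @ Z, M @ U])) @ M
    C11, C12, C22 = T.T @ O @ T, T.T @ O @ F, F.T @ O @ F
    schur = C11 - C12 @ np.linalg.pinv(C22, rcond=1e-10) @ C12.T
    return direct, schur

# Hypothetical 4 x 4 Latin square (rows = periods, columns = subjects).
design = np.array([[1, 2, 3, 4],
                   [2, 3, 4, 1],
                   [4, 1, 2, 3],
                   [3, 4, 1, 2]])
# Hypothetical realization: subjects 3 and 4 leave after periods 3 and 2.
direct, schur = info_matrix(design, t=4, l=[4, 4, 3, 2])
```

Both forms should agree up to rounding, and the resulting matrix has zero row sums because $MT_d 1_t$ lies in the column space of $MU$.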

Under a complete experiment, Kiefer (1975) defined a design to be universally
optimal if it maximizes $\Phi(C_d(\tau, p1_n))$ for any $\Phi$ satisfying:

(C.1) $\Phi$ is concave;
(C.2) $\Phi(S'CS) = \Phi(C)$ for any permutation matrix $S$;
(C.3) $\Phi(bC)$ is nondecreasing in the scalar $b > 0$.

Optimality criteria defined by such a $\Phi$ include, but are not limited to, A,
D, E and T. See Kiefer (1975) and Yeh (1986) for instance. In the subject
dropout setup there does not exist a design which maximizes $\Phi(C_d(\tau,l))$
for all realizations of $l$. One reasonable target is to find a design which
maximizes $\phi_0(d|\Phi,a) := E_a\Phi(C_d(\tau,l))$ for any $\Phi$ satisfying the above three
conditions. Here the expectation is taken over the probability space of $l$ with
parameter $a$. For notational simplicity, we omit the subscript $a$ for $E$
and the parameters $\Phi$ and $a$ for $\phi_0$ whenever it is clear from the context. So
we have $\phi_0(d) := \phi_0(d|\Phi,a) = E\Phi(C_d(\tau,l))$.

There are two major difficulties in maximizing $\phi_0(d)$ which make the
problem intractable, if not impossible: (i) $\Phi$ is a nonlinear function and
hence the expectation would interact with the form of $\Phi$. (ii) Even when
the dropout situation $l$ is fixed, there is still a lack of tools to deal with the
information matrix $C_d(\tau,l)$ if subjects drop out at different periods under $l$.
In order to tackle these difficulties, we propose to replace the original target
function $\phi_0(d)$ with the surrogate target function $\phi_1(d) = \Phi(C_d)$ where

(3) $C_d = C_{d11} - C_{d12}C_{d22}^-C_{d21}$, $\quad C_{dij} = EC_{dij}(l)$, $1 \le i,j \le 2$.

It will be shown in Section 5 that this replacement is very successful in
identifying highly efficient, if not optimal, designs for the criterion $\phi_0(d)$.
For $i = 0$ or $1$, let $d_i^*$ be an optimal design under $\phi_i$. Then define $e_i(d) =
\phi_i(d)/\phi_i(d_i^*)$, $i = 0, 1$, to be the efficiency of $d$ under the $\phi_i$-criterion. Also we call
$g(d) = \phi_0(d)/\phi_1(d)$ the gap function between the two target functions
for design $d$. Even though we are working on $\phi_1$ instead of $\phi_0$, the $\phi_0$-efficiency
$e_0(d)$ can be bounded below by $e_1(d)g(d)$ as shown by Lemma 2.

Lemma 1 [Pukelsheim (1993), pages 74-77]. The Schur complement of
a matrix $G \ge 0$ is a concave nondecreasing function of $G$.

Lemma 2. For any $\Phi$, $a$ and design $d$, we have $\phi_0(d) \le \phi_1(d)$. Further
we have

$e_0(d) \ge e_1(d)g(d)$.

In particular, for any $\phi_1$-optimal design $d$, we have $e_0(d) \ge g(d)$.

Proof. By Lemma 1 we have $EC_d(\tau,l) \le C_d$. Then we have

$\phi_0(d) = E\Phi(C_d(\tau,l))$
(4) $\qquad \le \Phi(EC_d(\tau,l))$
(5) $\qquad \le \Phi(C_d)$
(6) $\qquad = \phi_1(d)$.

By (6) we have $e_0(d) = \phi_0(d)/\phi_0(d_0^*) \ge \phi_0(d)/\phi_1(d_0^*) \ge \phi_0(d)/\phi_1(d_1^*) =
e_1(d)g(d)$. □

By (6) we have $g(d) \le 1$, and by Lemma 2 we have $g(d_1^*) \le e_0(d_1^*)$. That means if we could
find a $\phi_1$-optimal design, then the value of the gap function $g$ evaluated at
this design serves as a lower bound of its $\phi_0$-efficiency. Inequalities (4) and
(5) are essentially Jensen-type inequalities. The equalities therein both hold
if the realization of subject dropout, $l$, is not random. When the variation
in $l$ is not very large, it would be plausible to work on the surrogate target
of maximizing $\phi_1(d)$ instead of $\phi_0(d)$ since the value of the gap function
$g$ would be close to unity. Note that a popular choice of $\Phi$ is the trace of a
matrix (T-criterion), for which the equality in (4) always holds.
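
To make the relationship between the two target functions concrete, the sketch below computes $\phi_0(d)$ and $\phi_1(d)$ exactly by enumerating every dropout realization $l$ of a small design, using the E-criterion (second smallest eigenvalue divided by $n$) as $\Phi$. The design (all six orderings of three treatments), the dropout mechanism $a = (0, 0.3, 0.7)$ and the function names are assumptions chosen only for illustration; exhaustive enumeration is feasible here only because $n$ is tiny.

```python
import itertools
import numpy as np

def pinv(X):
    # Moore-Penrose inverse, used here as one choice of generalized inverse.
    return np.linalg.pinv(X, rcond=1e-10)

def blocks(design, t, l):
    """Return C_d11(l), C_d12(l), C_d22(l) for one dropout realization l."""
    p, n = design.shape
    T = np.zeros((n * p, t))
    F = np.zeros((n * p, t))
    for u in range(n):
        for k in range(p):
            T[u * p + k, design[k, u] - 1] = 1.0
            if k > 0:
                F[u * p + k, design[k - 1, u] - 1] = 1.0
    keep = np.concatenate([(np.arange(p) < li).astype(float) for li in l])
    M = np.diag(keep)
    N = M @ np.hstack([np.kron(np.ones((n, 1)), np.eye(p)),
                       np.kron(np.eye(n), np.ones((p, 1)))])   # (MZ | MU)
    O = M @ (np.eye(n * p) - N @ pinv(N.T @ N) @ N.T) @ M
    return T.T @ O @ T, T.T @ O @ F, F.T @ O @ F

def schur(C11, C12, C22):
    return C11 - C12 @ pinv(C22) @ C12.T

def phi_E(C, n):
    """E-criterion: second smallest eigenvalue of C divided by n."""
    return np.sort(np.linalg.eigvalsh((C + C.T) / 2))[1] / n

def phi0_phi1(design, t, a, crit=phi_E):
    """Exact phi_0 = E Phi(C_d(tau,l)) and phi_1 = Phi(C_d) with C_dij = E C_dij(l)."""
    p, n = design.shape
    support = [k + 1 for k in range(p) if a[k] > 0]
    phi0 = 0.0
    mean_blocks = [np.zeros((t, t)) for _ in range(3)]
    for l in itertools.product(support, repeat=n):
        prob = float(np.prod([a[li - 1] for li in l]))
        parts = blocks(design, t, l)
        phi0 += prob * crit(schur(*parts), n)
        mean_blocks = [mb + prob * part for mb, part in zip(mean_blocks, parts)]
    return phi0, crit(schur(*mean_blocks), n)

# Hypothetical design: p = t = 3, n = 6, one subject per ordering of (1, 2, 3).
design = np.array([[1, 1, 2, 2, 3, 3],
                   [2, 3, 1, 3, 1, 2],
                   [3, 2, 3, 1, 2, 1]])
phi0, phi1 = phi0_phi1(design, t=3, a=[0.0, 0.3, 0.7])
```

The ratio `phi0 / phi1` is then the gap function $g(d)$ for this toy configuration, and Lemma 2 implies it cannot exceed one.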

When the experiment is complete, the necessary and sufficient conditions
for $\phi_1$-universal optimality derived in Section 4 reduce to those of Kushner
(1997b). Note that the matrix $C_d$ in (3) is no longer an information matrix
for any design, and as a result the ideas of proving the existence of universally
optimal designs, given by Theorem 3.4 of Kushner (1997b), are not
applicable here. However, we found that similar results could be derived by
direct manipulation on the matrix $C_d$. See Sections 3.2 and 3.3 for details.
Moreover, since $A \ne B$ in general, the arguments in deriving the linear equation
as in the proof of Theorem 5.3 of Kushner (1997b) are not applicable here
either. For the approach of tackling this difficulty, see Section 4.1 for details.

2.3. Preliminary results.

Lemma 3. Under Assumptions 1 and 2 we have $C_{d11} = T_d'VT_d$, $C_{d12} =
T_d'VF_d$ and $C_{d22} = F_d'VF_d$ with

(7) $V = \sum_{k=m}^{p} (\alpha_k I_n - n^{-1}\beta_k J_n) \otimes B^k_p = I_n \otimes A - n^{-1} J_n \otimes B$.

Since $B_1 = 0$, the $m$ in (7) could be replaced by $\max(m, 2)$. A heuristic
explanation for this observation is that when $l_i = 1$ there is no information
gained from this subject, because we rely on within subject comparison for
treatments in crossover designs. When the experiment is complete we have
$\alpha_k = \beta_k = 0$ for all $1 \le k \le p-1$ and $\alpha_p = \beta_p = 1$. In this case, we have the
reduction of $A = B = B_p$ and $V = B_n \otimes B_p$, for which the optimality problem
has been extensively studied in literature.
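
The complete-experiment reduction $V = B_n \otimes B_p$ can be checked numerically: with no dropout, $M$ is the identity and $O = \mathrm{pr}^\perp(Z|U)$. The following is a minimal sketch with arbitrarily chosen $n = 5$ and $p = 4$; the helper name `centering` is introduced only for this example.

```python
import numpy as np

def centering(k):
    """B_k = I_k - J_k / k."""
    return np.eye(k) - np.ones((k, k)) / k

n, p = 5, 4                                   # arbitrary illustrative sizes
Z = np.kron(np.ones((n, 1)), np.eye(p))       # Z = 1_n (x) I_p
U = np.kron(np.eye(n), np.ones((p, 1)))       # U = I_n (x) 1_p
N = np.hstack([Z, U])

# pr^perp(Z|U) with the Moore-Penrose inverse as generalized inverse
O = np.eye(n * p) - N @ np.linalg.pinv(N.T @ N, rcond=1e-10) @ N.T
V = np.kron(centering(n), centering(p))       # B_n (x) B_p
max_abs_diff = np.abs(O - V).max()
```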

Corollary 1. Any design which is $\phi_1$-optimal with $\Phi$ satisfying conditions
(C.1)-(C.3) under model (1) is still optimal under the same criterion
when the within subject covariance is of the form

(8) $\Sigma = I_p + \eta 1_p' + 1_p \eta'$.

One special case is the compound symmetric covariance matrix, that is, $\Sigma =
I_p + bJ_p$. Here $\eta$ is an arbitrary vector, and $b$ is an arbitrary real number.

Proof. Let $\Sigma_k$ be the $k \times k$ upper left submatrix of $\Sigma$ for $1 \le k \le p$.
By direct calculation, we have

(9) $\Sigma_k^{-1} - \Sigma_k^{-1}J_k\Sigma_k^{-1}/1_k'\Sigma_k^{-1}1_k = B_k$.

By following the same calculation as the proof of Lemma 3, the corollary is
established in view of equation (9). □
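
Identity (9) is easy to confirm numerically for a given $\eta$. The sketch below uses an arbitrary vector `eta` (an assumption of this example, chosen small enough that $\Sigma$ stays positive definite) and checks the identity for every leading submatrix $\Sigma_k$.

```python
import numpy as np

p = 5
eta = np.array([0.1, -0.05, 0.2, 0.0, 0.05])   # arbitrary illustrative vector
one = np.ones(p)
Sigma = np.eye(p) + np.outer(eta, one) + np.outer(one, eta)   # type-H matrix (8)

def lhs_of_9(Sigma_k):
    """Sigma_k^{-1} - Sigma_k^{-1} J_k Sigma_k^{-1} / (1_k' Sigma_k^{-1} 1_k)."""
    k = Sigma_k.shape[0]
    inv = np.linalg.inv(Sigma_k)
    ones = np.ones(k)
    return inv - inv @ np.outer(ones, ones) @ inv / (ones @ inv @ ones)

# Largest entrywise deviation from B_k = I_k - J_k / k, for k = 1, ..., p
errors = []
for k in range(1, p + 1):
    B_k = np.eye(k) - np.ones((k, k)) / k
    errors.append(np.abs(lhs_of_9(Sigma[:k, :k]) - B_k).max())
```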

Remark 1. The covariance matrix as in (8) is called a "type-H" matrix; 
see Huynh and Feldt (1970). 

3. $\phi_1$-universal optimality. This section explores the $\phi_1$-universal optimality
in approximate design theory, where $\phi_1$-universal optimality is defined
as follows.

Definition 2. Given $p$, $t$, $n$ and a dropout mechanism $a$, a design $d$ is
said to be $\phi_1$-universally optimal if $d$ maximizes $\phi_1(d)$ over all designs for
any $\Phi$ satisfying conditions (C.1), (C.2) and (C.3).

Section 3.1 introduces the ideas in approximate design theory as well as
the concept of symmetric designs. Section 3.2 shows that a design would
be $\phi_1$-universally optimal as long as its information matrix is of the form
$C_d = ny^*B_t/(t-1)$ with $y^*$ introduced by equation (14). Section 3.3 shows
that there always exists a symmetric design which satisfies this sufficient
condition for $\phi_1$-universal optimality, and further by the argument of Kiefer (1975)
that this condition is also necessary for any design to be $\phi_1$-universally optimal.
However, this condition is not immediately applicable in practice.
Section 4 gives an equivalent condition which is more readily applicable.
Some relevant technical preparations are given in Sections 3.4 and 3.5.

3.1. Approximate design theory and symmetric designs. A design $d$ with
$p$ periods, $t$ treatments and $n$ subjects could be considered as the result of
selecting $n$ sequences with replacement from the collection of all possible
$t^p$ sequences, and this collection is denoted by $\mathcal{S}$. Let $n_s$ be the number
of replications of sequence $s$ in the design, and define $P_d = (p_s, s \in \mathcal{S})$ with
$p_s = n_s/n$. When we ignore the ordering of the $n$ sequences in the design,
we have the one to one correspondence $d \leftrightarrow (n, P_d)$ with the restrictions
of (i) $\sum_{s\in\mathcal{S}} p_s = 1$, (ii) $p_s \ge 0$ and (iii) $np_s$ being an integer for all $s$. In
approximate design theory, we only keep the first two restrictions and allow
$np_s$ not to be an integer.

Let $\sigma$ be a permutation of symbols $\{1, 2, \ldots, t\}$. For a sequence $s = (t_1, \ldots, t_p)$,
we define $\sigma s = (\sigma(t_1), \ldots, \sigma(t_p))$. Then the design $\sigma d$ is defined by $P_{\sigma d} =
(p_{\sigma^{-1}s}, s \in \mathcal{S})$. The permutation matrix $S_\sigma$ is the unique matrix satisfying
$T_{\sigma s} = T_sS_\sigma$ for all $s \in \mathcal{S}$. In the sequel we replace the subject index $u$ by
sequence index $s$ whenever it is necessary.

A design $d$ is said to be symmetric if $P_d = P_{\sigma d}$ for all permutations $\sigma$. Also we define symmetric
blocks as $\langle s \rangle = \{\sigma s, \sigma \in \mathcal{P}\}$ where $\mathcal{P}$ is the collection of all possible $t!$ permutations,
that is, $|\mathcal{P}| = t!$. We further define $p_{\langle s \rangle} = \sum_{s \in \langle s \rangle} p_s$. For a symmetric
design, we have $p_s = p_{\langle s \rangle}/|\langle s \rangle|$ for any $s \in \langle s \rangle$. Given $p, t, n$, a symmetric
design $d$ is uniquely determined by $(p_{\langle s \rangle}, \langle s \rangle \subset \mathcal{S})$, where $\langle s \rangle \subset \mathcal{S}$ means that $\langle s \rangle$
runs through all distinct symmetric blocks contained in $\mathcal{S}$.
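
The symmetric blocks can be enumerated directly for small $t$ and $p$. The sketch below, with the hypothetical helper `symmetric_blocks`, applies every treatment permutation to each sequence and collects the resulting orbits; it is meant only to illustrate the definition, not as an efficient implementation.

```python
from itertools import permutations, product

def symmetric_blocks(t, p):
    """Partition the t^p sequences into orbits under relabelling of treatments."""
    perms = list(permutations(range(1, t + 1)))   # sigma as a tuple: sigma[x-1] = sigma(x)
    seen, blocks = set(), []
    for s in product(range(1, t + 1), repeat=p):
        if s in seen:
            continue
        block = {tuple(sigma[x - 1] for x in s) for sigma in perms}
        seen |= block
        blocks.append(sorted(block))
    return blocks

blocks = symmetric_blocks(t=3, p=3)
sizes = [len(b) for b in blocks]
# A symmetric design spreads the block weight evenly: p_s = p_<s> / |<s>|.
uniform_weights = {s: 1.0 / (len(blocks) * len(b)) for b in blocks for s in b}
```

For $t = p = 3$ the 27 sequences fall into five blocks, one per pattern of repeats (for example the block of (1,1,1) has three members and the block of (1,2,3) has six).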

3.2. A sufficient condition for $\phi_1$-universal optimality. Denote by $T_u$
(resp., $F_u$) the $p \times t$ submatrix of $T_d$ (resp., $F_d$) corresponding to the $u$th
subject. Define $\bar{T}_d = n^{-1}\sum_{u=1}^{n} T_u$, $\hat{T}_u = T_uB_t$ and $\hat{\bar{T}}_d = \bar{T}_dB_t$. The notations $\bar{F}_d$,
$\hat{F}_u$ and $\hat{\bar{F}}_d$ are defined in the same way corresponding to carryover effects.
Let $\bar{C}_{dij} = \sum_{\sigma\in\mathcal{P}} S_\sigma'C_{dij}S_\sigma/|\mathcal{P}|$, $1 \le i,j \le 2$, and $\bar{C}_d = \sum_{\sigma\in\mathcal{P}} S_\sigma'C_dS_\sigma/|\mathcal{P}|$.
Note that $\bar{C}_{dij}$, $1 \le i,j \le 2$, are completely symmetric, also $\bar{C}_{d11}$ and $\bar{C}_{d12} =
(\bar{C}_{d21})'$ have row and column sums as zero. Let $\mathbb{I}$ be the indicator function.
By Proposition 1 of Kunert and Martin (2000), we have

(10) $\bar{C}_d \le \tilde{C}_d$,

where

(11) $\tilde{C}_d = \bar{C}_{d11} - \bar{C}_{d12}(\bar{C}_{d22})^-\bar{C}_{d21}$,
$\qquad c_{dij} = \mathrm{tr}(B_t\bar{C}_{dij}B_t) = \mathrm{tr}(B_tC_{dij}B_t)$, $1 \le i,j \le 2$.

Define $\hat{C}_{dij} = \sum_{u=1}^{n} \hat{C}_{uij}$, where $\hat{C}_{uij} = G_i'AG_j$ with $G_1 = \hat{T}_u$ and $G_2 = \hat{F}_u$.
Since $B \ge 0$, we have

(12) $(B_tC_{dij}B_t)_{1\le i,j\le 2} = \begin{pmatrix} \hat{C}_{d11} & \hat{C}_{d12} \\ \hat{C}_{d21} & \hat{C}_{d22} \end{pmatrix} - n \begin{pmatrix} \hat{\bar{T}}_d' \\ \hat{\bar{F}}_d' \end{pmatrix} B \, (\hat{\bar{T}}_d, \hat{\bar{F}}_d) \le (\hat{C}_{dij})_{1\le i,j\le 2}$.

Define $q_{dij} = \mathrm{tr}(\hat{C}_{dij})$ and $q_{uij} = \mathrm{tr}(\hat{C}_{uij})$. Then we have $q_{dij} = \sum_{u=1}^{n} q_{uij}$. It
is easy to see that $q_{u22} > 0$ and hence $q_{d22} > 0$, which allows us to define

$q_d^* = q_{d11} - q_{d12}^2/q_{d22}$.

By (12) we have

$\begin{pmatrix} c_{d11} & c_{d12} \\ c_{d21} & c_{d22} \end{pmatrix} \le \begin{pmatrix} q_{d11} & q_{d12} \\ q_{d21} & q_{d22} \end{pmatrix}$,

and then by Lemma 1 we have

(13) $c_{d11} - c_{d12}^2/c_{d22} \le q_d^*$,

with equality when $\hat{\bar{T}}_d = \hat{\bar{F}}_d = 0$. The latter is achieved by designs
which are uniform on periods. To introduce the following theorem, we define

(14) $y^* = n^{-1}\max_d q_d^*$.

Theorem 1. If $C_d = ny^*B_t/(t-1)$ with $y^*$ defined in (14), then the
design $d$ is $\phi_1$-universally optimal.

Proof. By conditions (C.1) and (C.2) of $\Phi$ we have

(15) $\Phi(C_d) \le \Phi(\bar{C}_d)$,

where the equality holds if $C_d$ is completely symmetric, that is, $C_d = \mathrm{tr}(C_d)B_t/(t-1)$
since $C_d$ has row and column sums as zero. The theorem is proved in view
of (10), (11), (13), (14), (15) and condition (C.3). □

3.3. Existence and equivalence. Theorem 1 provides a sufficient condition
for a design to be $\phi_1$-universally optimal. A natural question is the
following: does there exist such a design? This section gives a positive answer
as well as its corresponding implications.

Theorem 2. For any symmetric design, we have:

(1) $C_d$ is completely symmetric;

(2) $\mathrm{tr}(C_d) = q_d^*$;

(3) given any design $d$ there always exists a corresponding symmetric design
which has the same value of $q_d^*$.

Remark 2. Note that Theorem 2 does not hold if we replace $C_d$ therein
by $C_d(\tau,l)$. Hence the argument cannot be applied to $E\Phi(C_d(\tau,l))$ directly.
This is why we work on $\phi_1$ instead of $\phi_0$ directly.

Corollary 2. (i) There exists a symmetric $\phi_1$-universally optimal design
$d$ with

(16) $C_d = \frac{ny^*B_t}{t-1}$.

(ii) If a design $d$ is $\phi_1$-universally optimal (or $\phi_1$-optimal with $\Phi$ strictly
concave or increasing), then we have (16).

Proof. (i) is proved by Theorems 1 and 2. (ii) is proved by (i) and the
remark in Kiefer's (1975) Proposition 1. □

3.4. A necessary condition for $\phi_1$-universal optimality. In this section
we give a necessary condition for a design to be $\phi_1$-universally optimal and
define quantities that will be useful for presenting the necessary and sufficient
conditions for $\phi_1$-universal optimality in Section 4. Now define the function
$q_s(x) = q_{s11} + 2q_{s12}x + q_{s22}x^2$ and $q_d(x) = q_{d11} + 2q_{d12}x + q_{d22}x^2$. Since $q_{dij} =
\sum_{u=1}^{n} q_{uij} = n\sum_{s\in\mathcal{S}} p_sq_{sij}$ we have

(17) $q_d(x) = n\sum_{s\in\mathcal{S}} p_sq_s(x)$.

Since $q_{d22} > 0$, by direct calculation we have

(18) $q_d^* = \min_x q_d(x) = n\min_x \sum_{s\in\mathcal{S}} p_sq_s(x)$.

By (14) and (18) we have

(19) $y^* = \max_{P}\min_x \sum_{s\in\mathcal{S}} p_sq_s(x)$.

Let $d^*$ be a design which maximizes $q_d^*$. By (17), (18) and (19) we have
$\min_x q_{d^*}(x) = q_{d^*}^* = ny^*$. Since $q_{d22} > 0$ the equation $q_{d^*}(x) = ny^*$ has a
unique solution which is denoted by $x^*$. Define

$\mathcal{T} = \{s \in \mathcal{S} : y^* = q_s(x^*)\}$.

Lemma 4 shows that any universally optimal design is supported on $\mathcal{T}$.

Lemma 4. If a design $d$ is $\phi_1$-universally optimal (or $\phi_1$-optimal with
$\Phi$ strictly concave or increasing) then we have

$p_s = 0$, $\quad s \notin \mathcal{T}$.

Proof. By Corollary 2, we have $\mathrm{tr}(C_d) = ny^*$ and $C_d = \bar{C}_d$. By (10),
(11) and (13) we have $\mathrm{tr}(\bar{C}_d) \le q_d^*$. The lemma is proved in view of (14)
and Section 4.4 of Kushner (1997b). □

3.5. Determination of $x^*$, $y^*$ and $\mathcal{T}$. For a sequence $s = (t_1, t_2, \ldots, t_p)$,
define $s_k = (t_1, \ldots, t_k)$ to be the first $k$ periods of $s$. Particularly, we have
$s = s_p$. For $1 \le k \le p$ and $1 \le i \le t$, we define the treatment/sequence index
$f_{s_k,i} = \sum_{j=1}^{k} \mathbb{I}_{t_j = i}$. To introduce the following theorem, we define two special
symmetric blocks. The symmetric block $\langle di \rangle$ consists of all sequences having
distinct treatments in the $p$ periods. The symmetric block $\langle re \rangle$ consists of
all sequences having distinct treatments in the first $p-1$ periods, with the
treatment in period $p-1$ repeating in period $p$.

Theorem 3. For any integer $k \ge t$, define $z_k$ and $r_k$ to be integers
satisfying $k = z_kt + r_k$ and $0 \le r_k < t$.

(i) If m> t and 

v 

(20) a k [k(mt -t 2 + l-k) + t- r k (t - r k + 1)] > 0, 

k=m 

then 

x* = 0, 

V 

y*=Y, a k[k(l - 1/t) - r k (t - r k )/pt], 

k=m 

T = {s:f Sk ,i = z k or z k + 1,1 <i< t,m <k<p}. 

(ii) If p<t and 
P-i 

(21) Yl a k( k ~ 1 )(P + W ~ k )< «p[(P - !) 2 - (! + V*)P + 1/*], 

k=m
then

x* = l/(p-l), 



V- u {, 2p-l-k + l/t \ 



y = 

k=m 

T = (re) U (di). 

When the two sides of (21) are equal, we have T = (re). 
(iii) Let 

n=mMk-m-i-i/ty 

If (P - I)" 1 <x <(p- 2)~ l , then 
x* = x , 

v p-i 

y* = E afc ^ - - V* - 1 / kt )4 - 2 E ak{l ~ V^o 

k=m k=m 
P 

+ Y / a k (k-l)-2/p, 

k=m 

T = (re). 

Remark 3. Under complete experiment, Theorem 3(i) applies to the 
case p > t, and Theorem 3 (ii) applies to the case p < t. Actually Theo- 
rem 3(i), (ii) reduce to Theorem 1 of Kushner (1998). One can extrapolate 
by continuity that Theorem 3(i), (ii) cover the cases when the dropout issue 
is not very serious. 

Remark 4. When m = p — 1, we would also discuss parts (i) and (ii) 
of Theorem 3. For (i), a sufficient condition for (20) is p > t + 3. For (ii), 
inequality (21) simplifies to 

!00 , . (p-2)(l + l/t) 

K ' p ~ (p- l) 2 -2 -1/t 

The right-hand side of (22) mainly depends on p, and it will become very 
small for large p. Particularly, a sufficient condition for (22) is 

n (p-2)(l + l/t) 1 
a v > — —-, TTo — -z. — —r. + 



p ~ n + l(p-l) 2 -2-1/t n + 1' 

4. Linear equations for $\phi_1$-universal optimality. Built upon the results
of Section 3, this section provides feasible equivalent conditions in approximate
design theory for $\phi_1$-universal optimality.

4.1. Equations for general designs. Recall that $\hat{T}_u = T_uB_t$ and $\hat{F}_u =
F_uB_t$, and then we define

Cd = Cdii — CdnCjnnCduii 

(23) . « . 

&dij = ^ ] C u ij, 1 < i, J < 2, 

u=l 

where 

Oui = 2^,4 - £?)T U + f' u BT u , C ul2 = T'JyA - B)F U + f' a BF u , 

Cu2i = C' ul2 , C u22 = F' U {A — B)F U + F' U BF U . 

We shall replace C u ij with C s ij in emphasizing sequence s instead of subject 
u of a design. By direct calculation we have 

(24) C dij = C dij + nG'iBGj , 1 < i, j < 2, 

where Gi = T d and G2 = F d . The following lemma is crucial for the proof 
of Theorem 4. 

Lemma 5. If d is (pi-universally optimal (or <pi -optimal with <I> strictly 
concave or increasing), we have C d = C d = ny*Bt/(t — 1). 

Proof. By (24) and Lemma 1 we have 

(25) C d < C d . 
By Corollary 2(ii) we have 

(26) C d = ny*B t /{t-l). 

Let d be the symmetrized version of design d as defined by (48), and then 
by (23) we have 

(27) Y< S 'Ai S °/\' P \= G dir 

Again by (24) we have C dlj = G7-AG7, with Gi = {f' d ,T d )', G 2 = (P d ,Ftf, 
and 



A 



nB 
V 



Since A > we have by Proposition 1 of Kunert and Martin (2000) that 

(28) ^s' a c d s a /\r\<c d ,
in view of (27). Since Tg = F d = for the symmetric design d, we have
(29) C- d = C g . 

Combining (25)-(28) and (29), we have 
ny* 



t-l 



Bt = Cd = Cd 



<Y,s' a c d sj\v\ 



cr£V 

Hence we have Cg = ny*Bt/(t — 1) in view of Corollary 2 and thus 
J2S' a C d S a /\T\=ny*B t /(t-l), 

<t<EP 

which in turn yields 

(30) tr(C d )=ny*. 

The lemma is now proved in view of (25) and (30). □ 

Theorem 4. A design $d$ is $\phi_1$-universally optimal (or $\phi_1$-optimal with
$\Phi$ strictly concave or increasing) if and only if

(31) $\sum_{s\in\mathcal{T}} p_s[C_{s11} + x^*C_{s12}B_t] = \frac{y^*}{t-1}B_t$,

(32) $\sum_{s\in\mathcal{T}} p_s[C_{s21} + x^*C_{s22}B_t] = 0$,

(33) $\sum_{s\in\mathcal{T}} p_sB(\hat{T}_s + x^*\hat{F}_s) = 0$,

(34) $\sum_{s\in\mathcal{T}} p_s = 1$,

(35) $p_s = 0$, $\quad s \notin \mathcal{T}$,

where $C_{sij}$ are the per-sequence matrices introduced with (23).

Based on Theorem 1 and Corollary 2, (16) is also a necessary and sufficient
condition for $\phi_1$-universal optimality. However, (16) is not directly applicable
for identifying designs. Note that the conditions in Theorem 4 are merely
linear equation systems for $p_s$, and hence can be easily implemented to derive
exact designs. See Section 5.

4.2. Equations for symmetric designs. Note that $q_s(x)$ is invariant to
treatment permutation, that is,

(36) $q_s(x) = q_{\sigma s}(x)$.

Combining Theorem 4.5 of Kushner (1997b), Theorem 2, Corollary 2, Lemma 4
and equation (36), we have the following:

Theorem 5. A symmetric design is $\phi_1$-universally optimal if

$\sum_{\langle s \rangle \subset \mathcal{T}} p_{\langle s \rangle} q_s'(x^*) = 0$,

$\sum_{\langle s \rangle \subset \mathcal{T}} p_{\langle s \rangle} = 1$,

$p_s = 0$, $\quad s \notin \mathcal{T}$,

where $q_s'(x)$ is the derivative of $q_s(x)$ with respect to $x$.

5. Exact designs. This section gives algorithms to identify efficient exact
designs based on the optimality equations in Section 4. Results are compared
to designs proposed in literature. For the matrix $C_d(\tau,l)$, denote its eigenvalues
by $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_t$. We define the criteria of A, D, E and T
as:

• $\Phi_A(C_d(\tau,l)) = (t-1)(n\sum_{i=2}^{t}\lambda_i^{-1})^{-1}$ [$\lambda_2 = 0$ implies $\Phi_A(C_d(\tau,l)) = 0$];

• $\Phi_D(C_d(\tau,l)) = n^{-1}(\prod_{i=2}^{t}\lambda_i)^{1/(t-1)}$;

• $\Phi_E(C_d(\tau,l)) = n^{-1}\lambda_2$;

• $\Phi_T(C_d(\tau,l)) = [n(t-1)]^{-1}(\sum_{i=2}^{t}\lambda_i)$.

Section 5.1 provides an algorithm to derive exact designs for general configurations
of $p, t, n$. Section 5.2 illustrates how to derive symmetric designs by
straightforward calculations. In utilizing Lemma 2, $e_1(d)$ is further bounded below
by $\underline{e}_1(d) = \phi_1(d)/\phi_1(\tilde{d})$, where $\tilde{d}$ is a $\phi_1$-optimal design in approximate design
theory which may not necessarily exist as an exact design. Thus the function
$\ell(d) = \underline{e}_1(d)g(d)$ serves as a feasible lower bound of $e_0(d)$.
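
The four criteria defined at the start of this section translate directly into a few lines of code. The sketch below is one possible implementation; the function name `criteria`, the tolerance used to decide whether $\lambda_2$ is zero, and the test matrix $3B_4$ with $n = 2$ are assumptions of this example.

```python
import numpy as np

def criteria(C, n, tol=1e-12):
    """A-, D-, E- and T-criteria of a t x t information matrix with zero row sums."""
    t = C.shape[0]
    # Drop lambda_1 = 0 and keep lambda_2 <= ... <= lambda_t.
    lam = np.sort(np.linalg.eigvalsh((C + C.T) / 2))[1:]
    lam = np.clip(lam, 0.0, None)                     # guard against tiny negative rounding
    A = 0.0 if lam[0] <= tol else (t - 1) / (n * np.sum(1.0 / lam))
    D = np.prod(lam) ** (1.0 / (t - 1)) / n
    E = lam[0] / n
    T = np.sum(lam) / (n * (t - 1))
    return {"A": A, "D": D, "E": E, "T": T}

# Illustrative completely symmetric matrix C = 3 * B_4 with n = 2 subjects.
t, n = 4, 2
C = 3.0 * (np.eye(t) - np.ones((t, t)) / t)
values = criteria(C, n)
```

For a completely symmetric matrix $cB_t$ all nonzero eigenvalues equal $c$, so the four criteria coincide at $c/n$, which gives a quick sanity check of the normalizations.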

5.1. General exact designs. This section gives an algorithm to derive
efficient exact designs for any given configuration of $p, t, n$ and compares
them to designs in literature. Note that the latter designs are proposed
for judiciously chosen $p, t, n$ while our algorithm works for any configuration
of $p, t, n$. Even under these chosen circumstances our designs are still shown
to be more efficient and robust. By Theorem 4 we have the following:

Corollary 3. A design $d$ is $\phi_1$-universally optimal (or $\phi_1$-optimal with
$\Phi$ strictly concave or increasing) if and only if

(37) $\sum_{s\in\mathcal{T}} n_s[C_{s11} + x^*C_{s12}B_t] = \frac{ny^*}{t-1}B_t$,

(38) $\sum_{s\in\mathcal{T}} n_s[C_{s21} + x^*C_{s22}B_t] = 0$,

(39) $\sum_{s\in\mathcal{T}} n_sB(\hat{T}_s + x^*\hat{F}_s) = 0$,

(40) $\sum_{s\in\mathcal{T}} n_s = n$,

(41) $n_s = 0$, $\quad s \notin \mathcal{T}$,

where $C_{sij}$ are the per-sequence matrices introduced with (23).

Note that an exact design satisfying equations (37)-(41) does not necessarily
exist due to the discrete nature of the problem, especially when the
dropout mechanism is arbitrary. However, as shown by the following examples,
it is plausible to find a design which is as close to satisfying equations
(37)-(41) as possible. Specifically, let $N_{\mathcal{T}} = \{n_s, s \in \mathcal{T}\}'$, and then equations
(37)-(39) could be written in a matrix form as

$X_{\mathcal{T}}N_{\mathcal{T}} = Y_{\mathcal{T}}$,

with $X_{\mathcal{T}}$ and $Y_{\mathcal{T}}$ uniquely determined by equations (37)-(41) and the ordering
of the $n_s$ in the vector $N_{\mathcal{T}}$. To find an efficient design for an arbitrarily
given $n$, one could choose a design which minimizes

(42) $\|X_{\mathcal{T}}N_{\mathcal{T}} - Y_{\mathcal{T}}\|$,

subject to

$\sum_{s\in\mathcal{T}} n_s = n$.

Here $\|\cdot\|$ is a norm for a vector. For all subsequent examples in this section,
we take $\|\cdot\|$ to be the Euclidean norm. Then solving (42) is straightforward
by utilizing integer optimization software packages. Note that the
computational complexity of the above minimization problem depends on
$|\mathcal{T}|$, which in turn depends on $p$ and $t$ only.
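
For very small problems, the minimization (42) can be illustrated by exhaustive search over all nonnegative integer vectors that sum to $n$. The sketch below is only a stand-in for the integer optimization packages mentioned above: the function names are hypothetical, and the matrix `X` and vector `Y` are toy placeholders rather than quantities derived from equations (37)-(39).

```python
from itertools import combinations
import numpy as np

def compositions(n, m):
    """Yield all nonnegative integer vectors of length m summing to n (stars and bars)."""
    for bars in combinations(range(n + m - 1), m - 1):
        prev, parts = -1, []
        for b in bars + (n + m - 1,):
            parts.append(b - prev - 1)
            prev = b
        yield parts

def best_exact_design(X, Y, n):
    """Brute-force minimizer of ||X N - Y|| (Euclidean) subject to sum(N) = n, N integer >= 0."""
    best, best_val = None, np.inf
    for N in compositions(n, X.shape[1]):
        val = np.linalg.norm(X @ np.array(N, dtype=float) - Y)
        if val < best_val:
            best, best_val = N, val
    return best, best_val

# Toy placeholders: three candidate sequences, n = 4 subjects.
X = np.eye(3)
Y = np.array([2.6, 1.3, 0.1])
N_opt, residual = best_exact_design(X, Y, n=4)
```

The number of candidate vectors grows combinatorially in $n$ and $|\mathcal{T}|$, so a dedicated integer least squares or mixed-integer solver would be the practical choice beyond toy sizes.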

Besides maximizing the expectation $\phi_0(d) = E\Phi(C_d(\tau,l))$, one might also
be interested in minimizing the variance $V_\Phi(d) = \mathrm{Var}(\Phi(C_d(\tau,l)))$ to achieve
robustness. To compare two designs under these two functions, we define
$\phi_0(d,d') = \phi_0(d)/\phi_0(d')$ and $V_\Phi(d,d') = V_\Phi(d)/V_\Phi(d')$.



5.1.1. Comparisons to designs of Low, Lewis and Prescott (1999). The
setup and target of Low, Lewis and Prescott (1999) are the same as in this
paper. However, they searched all combinations of Latin squares for the
special cases of p = t = 4, n = 16 and p = t = 4, n = 24 only.

[Figure 1: four panels (a)-(d), each plotted against $\theta$ on (0, 1).
Panel titles: (a) Lower Bound of Efficiency for $d_1$; (b) Lower Bound of
Efficiency for $d$ by Our Algorithm.]

Fig. 1. The letters A, D, E and T represent the choice of criterion
function $\Phi$. (a) The lower bound of efficiency $\ell(d_1)$ for
$a = (0, 0, \theta, 1-\theta)$ with $\theta \in (0, 1)$. (b) The lower bound
of efficiency $\ell(d)$ with $d$ obtained by algorithm (42). In particular,
$\theta = 1/2$ implies $d = d_2$. (c) The ratio of means:
$\phi_0(d)/\phi_0(d_1)$. (d) The ratio of variances: $V_\Phi(d)/V_\Phi(d_1)$.

When p = t = 4 and n = 16, they proposed a design as shown by Figure 1(b)
therein, which is denoted by $d_1$ here. By algorithm (42), the dropout
mechanism $a = (0, 0, 1/2, 1/2)$ yields $d_2$:

         2123343211124443
$d_2$:   4434112122343321
         3211234433412214
         3211234444231132

Tables 1 and 2 summarize the performances of designs $d_1$ and $d_2$ under
the criteria A, D, E and T. Since $e_0(d) \ge \ell(d) = e_1(d) g(d)$, a
design $d$ would be $\phi_0$-efficient if both $e_1(d)$ and $g(d)$ are close
to unity. Algorithm (42) focuses on $e_1(d)$ and provides a satisfactory
solution in view of the column of $e_1$ in Table 2. We observe that the
values of $g$ in both of these tables are very close to unity except for the
E-criterion. Notice that the values of the gap function $g$ for the
T-criterion are always the largest among all criteria, which is due to the
linearity of the T-criterion.

Table 1
Performance of $d_1$ under $a = (0, 0, 1/2, 1/2)$

     $\phi_0(d_1)$   $V_\Phi(d_1)$   $e_1(d_1)$   $g(d_1)$    $\ell(d_1)$
A    0.6646345       0.07223834      0.9558432    0.9592851   0.9169261
D    0.6747419       0.06776632      0.9603310    0.9693223   0.9308702
E    0.5528575       0.09039916      0.8960473    0.8512042   0.7627192
T    0.6848634       0.06334558      0.9650531    0.9790485   0.9448338
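
As a quick arithmetic check of the relation $\ell(d) = e_1(d) g(d)$, the sketch below multiplies the $e_1$ and $g$ columns of Table 1 and compares the product with the tabulated lower bound. The numbers are transcribed from Table 1; the dictionary layout and variable names are only illustrative.

```python
# Values transcribed from Table 1 (design d1, a = (0, 0, 1/2, 1/2)).
table1 = {
    "A": {"e1": 0.9558432, "g": 0.9592851, "ell": 0.9169261},
    "D": {"e1": 0.9603310, "g": 0.9693223, "ell": 0.9308702},
    "E": {"e1": 0.8960473, "g": 0.8512042, "ell": 0.7627192},
    "T": {"e1": 0.9650531, "g": 0.9790485, "ell": 0.9448338},
}

# The lower bound of efficiency is the product of e1 and the gap function g.
for criterion, row in table1.items():
    product = row["e1"] * row["g"]
    print(f"{criterion}: e1 * g = {product:.7f}, tabulated = {row['ell']:.7f}")
```

The products agree with the last column to the seven digits reported, so the E-criterion's lower bound is pulled down mainly by its smaller gap value $g$.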

In comparison, $d_2$ is more efficient and robust than $d_1$ under all
criteria in view of the columns of $\phi_0$ and $V_\Phi$, respectively. A
lesson from the latter is that a design with a more diverse composition of
sequences is generally more robust. Here in $d_2$, only the sequences 1234
and 4321 appear twice while each of the remaining sequences appears only
once. Low, Lewis and Prescott (1999) had similar observations.
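
The composition of $d_2$ can be read off mechanically. The sketch below assumes the usual array convention, with rows as periods and columns as subjects, and uses the four rows of $d_2$ as transcribed from the display above; it then counts how often each treatment sequence occurs.

```python
from collections import Counter

# Rows of d2 (one row per period, one column per subject), as transcribed.
rows = [
    "2123343211124443",
    "4434112122343321",
    "3211234433412214",
    "3211234444231132",
]

# Reading down a column gives the treatment sequence of one subject.
sequences = ["".join(column) for column in zip(*rows)]
counts = Counter(sequences)
repeated = sorted(seq for seq, c in counts.items() if c > 1)

print(len(sequences), "subjects,", len(counts), "distinct sequences")
print("sequences used more than once:", repeated)
```

Under this reading the design uses 14 distinct sequences for 16 subjects, with 1234 and 4321 being the only repeated ones.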

We now consider the performance of a design obtained by algorithm (42)
for dropout mechanisms of the form $a = (0, 0, \theta, 1-\theta)$,
$0 < \theta < 1$. By heuristic arguments in Section 2.2, the value of the
gap function $g$ would be smaller if there is larger variability in $l$.
This is supported by the U-shaped curve of $\ell(d)$ in Figure 1(b). From
Figure 1(a), we see that the efficiency of $d_1$ has an inverse
relationship with the value of $\theta$. Figure 1(c) shows that the
advantage of our algorithm over $d_1$ is more obvious when there is a large
chance of dropout. This means that our algorithm succeeded in adapting the
choice of designs to different dropout mechanisms. Figure 1(d) shows that
the design by our algorithm is also more robust than $d_1$ against the
randomness of subject dropout.

When p = t = 4 and n = 24, Low, Lewis and Prescott (1999) proposed a design
which consists of two copies of three distinct 4 x 4 Latin squares, which
is denoted by $d_3$ here. When $a = (0, 1/10, 2/5, 1/2)$ our



Table 2
Performance of $d_2$ under $a = (0, 0, 1/2, 1/2)$

     $\phi_0(d_2)$   $V_\Phi(d_2)$   $e_1(d_2)$   $g(d_2)$    $\ell(d_2)$
A    0.7058735       0.05266523      0.9989759    0.9748175   0.9738192
D    0.7094851       0.05129209      0.9991830    0.9796020   0.9788017
E    0.6337475       0.06979073      0.9848636    0.8877519   0.8743145
T    0.7130567       0.05005383      0.9993922    0.9843273   0.9837291

Table 3
Performance of $d_3$ and $d_4$ under $a = (0, 1/10, 2/5, 1/2)$

     $\phi_0(d_4)$  $V_\Phi(d_4)$  $e_1(d_4)$  $g(d_4)$  $\ell(d_4)$  $\phi_0(d_4,d_3)$  $V_\Phi(d_4,d_3)$
A    0.6791         0.0526         0.999983    0.9777    0.9777       1.0112             0.9705
D    0.6822         0.0516         0.999983    0.9822    0.9821       1.0115             0.9631
E    0.6118         0.0648         0.999979    0.8809    0.8809       1.0089             1.0386
T    0.6852         0.0506         0.999983    0.9866    0.9869       1.0118             0.9562

algorithm yields $d_4$, which consists of one copy of the first twelve
sequences and two copies of the last six sequences of (43). According to
the last two columns of Table 3, $d_4$ outperforms $d_3$ in terms of both
efficiency and robustness, with the exception of the robustness under the
E-criterion.

        222344233443     342111
(43)    343432344222     111243   x 2.
        431121112134     234432
        431121421311     234324



5.1.2. Comparison to designs of Bose and Bagchi (2008), Majumdar,
Dean and Lewis (2008) and Zhao and Majumdar (2012). When the realization
of subject dropout $l$ is not random, we have $\phi_0 = \phi_1$. In this
case, Bose and Bagchi (2008) have the following results:

(1) When $p = t \ge 3$ is a prime or prime power and $n = t(t-1)$, a design
is found to be universally optimal whenever $a_q = 1$ for any
$3 \le q \le p$.

(2) When $p = t \ge 3$ is a prime or prime power, $t \equiv 3 \pmod 4$ and
$n = 2t$, a design is found to be universally optimal whenever $a_q = 1$
with $q = (p+1)/2$ or $p$.

(3) When $p = t \ge 3$ is a prime or prime power, $t \equiv 1 \pmod 4$ and
$n = 4t$, a design is found to be universally optimal whenever $a_q = 1$
with $q = (p+1)/2$ or $p$.

For example, when t = p = 5 the smallest n should be 4t = 20. In this case
the design proposed by them is universally optimal, either when the
experiment is complete or when all subjects drop out immediately after
period 3 with probability 1, that is, $a_3 = 1$. We denote this design by
$d_5$, which is given by Example 3 of Bose and Bagchi (2008). When
$a = (0, 1/20, 3/20, 1/5, 3/5)$, algorithm (42) yields $d_6$ as follows:

         12443212113245555343
         25121323544554423131
$d_6$:   34355445331112212254
         41532531225331144425
         41532154452423331512

Table 4
Performance of $d_5$ and $d_6$ under $a = (0, 1/20, 3/20, 1/5, 3/5)$

     $\phi_0(d_6)$  $V_\Phi(d_6)$  $e_1(d_6)$  $g(d_6)$  $\ell(d_6)$  $\phi_0(d_6,d_5)$  $V_\Phi(d_6,d_5)$
A    0.7555         0.05944        0.99888     0.97848   0.97738      1.00117            0.98842
D    0.7589         0.05827        0.99891     0.98277   0.98170      1.00172            0.98333
E    0.6712         0.07399        0.99091     0.87621   0.86825      0.99449            1.03435
T    0.7621         0.05719        0.99894     0.98700   0.98595      1.00224            0.97877

Table 4 shows that $d_6$ is more efficient and robust than $d_5$ under the
criteria A, D and T, while the result is reversed under the criterion E.
The reason for the latter is that $d_5$ did a better job in avoiding
disconnected designs under subject dropout, that is,
$\Phi_E(C_d(\tau, l)) = 0$.

Since the differences between $d_5$ and $d_6$ are small in terms of both
efficiency and robustness, we conclude that the designs of Bose and Bagchi
(2008) successfully guard against the loss of information due to subject
dropout. The same conclusion applies to Majumdar, Dean and Lewis (2008)
and Zhao and Majumdar (2012) since they use similar ideas.

5.1.3. Comparisons to designs of Kushner (1998). Kushner (1998) derived
conditions for universal optimality as a special case of ours under
complete experiment. In particular, when t = 3, p = 5 and n = 30, Example 4
of Kushner (1998) gives a design satisfying the optimality equations
therein, which is denoted by $d_7$ here. When $a = (0, 0, 1/3, 1/3, 1/3)$
our algorithm gives $d_8$, which consists of five copies of (44),

                123312
                332121
(44)   $d_8$:   211233   x 5.
                211233
                123321

Based on Table 5, $d_8$ outperforms $d_7$ in terms of both efficiency and
robustness even though $d_7$ is universally optimal under complete
experiment.


Table 5
Performance of $d_7$ and $d_8$ under $a = (0, 0, 1/3, 1/3, 1/3)$

     $\phi_0(d_8)$  $V_\Phi(d_8)$  $e_1(d_8)$  $g(d_8)$  $\ell(d_8)$  $\phi_0(d_8,d_7)$  $V_\Phi(d_8,d_7)$
A    1.2340         0.053908       1           0.99591   0.99591      1.11018            0.57705
D    1.2347         0.053736       1           0.99643   0.99643      1.10598            0.59362
E    1.2004         0.059782       1           0.96877   0.96877      1.16339            0.51397
T    1.2353         0.053573       1           0.99696   0.99696      1.10177            0.60992

Table 6
Performance of $d_9$ under $a = (0, 0, 0, 0, 2/5, 3/5)$

                    $\phi_0(d_9)$  $V_\Phi(d_9)$  $e_1(d_9)$  $g(d_9)$  $\ell(d_9)$
A, D, E, T, etc.    2.7368         0.09152        0.99511     0.997823  0.99295


5.2. Symmetric exact designs. This section illustrates the usage of
Theorem 5 in deriving efficient symmetric exact designs. By Remark 4 in
Section 3.5, when t = 2, p = 6 and m = p - 1 = 5, inequality (20) in
Theorem 3 always holds regardless of the value of $a$. By applying Theorem
3(i), we have $x^* = 0$ and hence $q_s'(x^*) = 2 q_{s12}$. Moreover, it is
easy to see that the support $\mathcal T$ essentially contains all
sequences which assign a subject to each of the two treatments for 3 out of
the total of 6 periods, and hence $|\mathcal T| = 20$. Within each symmetric
block, there are two sequences since t = 2. Hence there are 10 symmetric
blocks. However, it is not necessary to include all these symmetric blocks
in the design. In particular, when $a = (0, 0, 0, 0, 2/5, 3/5)$, we have
$q_{s_1}'(x^*)/q_{s_2}'(x^*) = q_{s_1 12}/q_{s_2 12} = -6.01$ for
$s_1 = 122121$ and $s_2 = 122211$. In the spirit of Theorem 5 we propose a
small sized design, $d_9$, which consists of one copy of the sequences
122121 and 211212 and six copies of the sequences 122211 and 211122. So we
have n = 14 for $d_9$. The point is that we have the freedom of selecting
different subclasses of $\mathcal T$. The performance of $d_9$ is given in
Table 6. It shows the high efficiency and robustness of $d_9$. Note that
when t = 2 all criteria are equivalent.
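
The counts quoted in this example can be reproduced by enumeration. The sketch below lists all sequences over two treatments that use each treatment in 3 of the 6 periods, groups each sequence with its relabelled copy into a symmetric block, and checks that the sequences of $d_9$ lie in the support. The helper `relabel` and the dictionary `d9` are illustrative names, not notation from the paper.

```python
from itertools import product

p = 6

# Support: sequences over treatments 1 and 2 with each treatment in 3 periods.
support = ["".join(s) for s in product("12", repeat=p) if s.count("1") == p // 2]


def relabel(seq):
    """Swap the two treatment labels of a sequence."""
    return seq.translate(str.maketrans("12", "21"))


# A symmetric block collects a sequence together with its relabelled copy.
blocks = {frozenset((s, relabel(s))) for s in support}

# Design d9 as described in the text: sequence -> number of subjects.
d9 = {"122121": 1, "211212": 1, "122211": 6, "211122": 6}

print(len(support), "sequences in", len(blocks), "symmetric blocks")
print("n =", sum(d9.values()))
```

The design $d_9$ therefore draws on only two of the ten symmetric blocks, with weights 1 and 6 that roughly offset the ratio of $-6.01$ quoted above.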

6. Discussions. Subject dropout is a very important issue in planning a
crossover design. Table 5 and other examples in the literature show that an
optimal design under complete experiment is no longer optimal, and is
possibly even disconnected, when there is subject dropout. However, the
problem has received very limited attention in the literature so far, and
the majority of the research assumes that there is no subject dropout. Bose
and Bagchi (2008), Majumdar, Dean and Lewis (2008) and Zhao and Majumdar
(2012) all considered a nested structure such that a design, together with
its subdesign obtained by taking only the first q (< p) periods, are both
optimal or efficient. Naturally such designs would still be efficient when
all subjects drop out at periods between q and p. The issue with this
approach is that we lose adaptation to different dropout mechanisms.
Furthermore, their methods only apply to special configurations of p, t, n.

In order to take into account the dropout mechanism, one has to make
assumptions to formulate it. This paper adopts two mild assumptions and
works on the target function $\phi_0$, which is given by taking the
expectation of a regular optimality criterion with respect to a given
dropout mechanism. Low, Lewis and Prescott (1999) followed the same
approach. However, they only provided two case studies, and there were no
theoretical results regarding how to identify an efficient design in
general. The latter problem is itself intractable. To tackle it, we propose
to use the surrogate target function $\phi_1$ in place of $\phi_0$. It turns
out that this replacement is very successful. Examples in Section 5 show
that $\phi_1$-optimal (or highly efficient) designs are also highly
efficient under $\phi_0$. Moreover, these designs are also shown to be very
robust against the randomness of subject dropout due to the substantial
diversity in the composition of treatment sequences.

Theoretically, we derive feasible, equivalent conditions for a design to be
$\phi_1$-universally optimal in asymptotic design theory. These conditions
are essentially linear equations with respect to proportions of treatment
sequences from $\mathcal T$, a subclass of all possible treatment
sequences. A solution of the equations which yields an exact design does
not necessarily exist due to the discrete nature of the problem. However,
one can follow the spirit of the conditions and easily propose an
applicable algorithm to derive an efficient exact design for any criterion
and any configuration of p, t, n. In this paper, we adopt algorithm (42)
for general designs as well as the approach in Section 5.2 for symmetric
designs.

The problem of identifying exact designs for large values of p and t
remains an open problem. The critical difficulty is that as p and t grow,
the size of the support for admissible sequences, $|\mathcal T|$, increases
very fast. Typically $\mathcal T$ contains two distinct symmetric blocks, in
which case p = t = 6 usually yields $|\mathcal T| = 2 \times 6! = 1440$.
That means the majority of the sequences in $\mathcal T$ would not appear in
the design for a moderate value of n. The same issue has appeared in
Kushner (1997b). If we adopt the approach of symmetric designs as in
Section 5.2, we would need n to be of the same magnitude as $|\mathcal T|$.
On the other hand, algorithm (42) is essentially an integer programming
problem and the number of integer variables is equal to $|\mathcal T|$.
Hence it would be infeasible for a computer to handle when $|\mathcal T|$
is too large. For this problem, one possible solution is to reduce the size
of $\mathcal T$ through the study of intrinsic relationships among
treatment sequences. Another approach is to resort to algorithm
improvement.
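
To give a feel for this growth, the sketch below tabulates the support size under the typical situation described above, assuming (as an illustration only) that the support consists of a fixed number of symmetric blocks and that each block contains all $t!$ relabellings of a sequence. The function name `support_size` is hypothetical.

```python
from math import factorial


def support_size(t, blocks=2):
    """Support size when it consists of `blocks` symmetric blocks of t! sequences."""
    return blocks * factorial(t)


# Number of integer variables that algorithm (42) would have to handle.
for t in range(3, 9):
    print(f"t = {t}: |T| = {support_size(t)}")
```

Under this assumption the number of integer variables already exceeds ten thousand at t = 7, which is far beyond any moderate number of subjects n.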

7. Proofs. 

Proof of Lemma 3. It would be enough to show that $V = \mathrm{E}O$.
First, it is easy to show that
$B_p^{m_1} B_p^{m_2} = B_p^{\min(m_1, m_2)}$. We have
$MU = \mathrm{diag}(1_{l_1}, \ldots, 1_{l_n})$ and
$MZ = ((I_{l_1}, 0)', \ldots, (I_{l_n}, 0)')'$. Then we have

$\mathrm{pr}^{\perp}(MU) = \mathrm{diag}(B_{l_1}, \ldots, B_{l_n})$,

$\mathrm{pr}^{\perp}(MU) MZ = ((B_{l_1}, 0)', \ldots, (B_{l_n}, 0)')'$,

$Z'M'\mathrm{pr}^{\perp}(MU) MZ = \sum_{i=1}^{n} B_p^{l_i} = \sum_{k=1}^{p} h_k B_p^k$.

Without loss of generality, we could assume $h_p > 0$. Then one choice of
the g-inverse of $Z'M'\mathrm{pr}^{\perp}(MU)MZ$ is
$\sum_{i=1}^{p} g_i B_p^i$, where

(45)  $g_i = r_i^{-1} - r_{i+1}^{-1}$,  $1 \le i \le p-1$,

(46)  $g_p = h_p^{-1}$,

with $h_k = \sum_{i=1}^{n} 1_{[l_i = k]}$, $1 \le k \le p$, and
$r_i = \sum_{k=i}^{p} h_k$ denotes the number of subjects remaining at
period $i$, $1 \le i \le p$. Note that if $h_p = 0$, the value of $p$ in (45)
and (46) should be replaced by $\tilde p = \max\{k : h_k > 0\}$, and for
$k > \tilde p$ we let $h_k = 0$. It is easily seen that the following
arguments and thus the lemma would still hold. Now we have
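
The g-inverse in (45) and (46) can be checked numerically on a small case. The sketch below is only an illustration under a stated assumption: $B_p^k$ is taken to be the $p \times p$ matrix with $I_k - J_k/k$ in its leading $k \times k$ block and zeros elsewhere, which is the form consistent with the product rule $B_p^{m_1} B_p^{m_2} = B_p^{\min(m_1, m_2)}$. The dropout pattern `last_period` is hypothetical.

```python
from fractions import Fraction


def B(p, k):
    """Assumed B_p^k: I_k - J_k/k in the leading k x k block, zeros elsewhere."""
    return [[(Fraction(int(i == j)) - Fraction(1, k)) if i < k and j < k
             else Fraction(0) for j in range(p)] for i in range(p)]


def matmul(A, C):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * C[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(coefs, p):
    """Return sum_k coefs[k-1] * B_p^k as a p x p matrix."""
    total = [[Fraction(0)] * p for _ in range(p)]
    for k, c in enumerate(coefs, start=1):
        Bk = B(p, k)
        total = [[total[i][j] + c * Bk[i][j] for j in range(p)] for i in range(p)]
    return total


p = 4
last_period = [4, 4, 3, 2, 4, 3]                      # hypothetical l_1, ..., l_n
h = [last_period.count(k) for k in range(1, p + 1)]   # h_k
r = [sum(h[k - 1:]) for k in range(1, p + 1)]         # r_k = h_k + ... + h_p
g = [Fraction(1, r[i]) - Fraction(1, r[i + 1]) for i in range(p - 1)]
g.append(Fraction(1, h[p - 1]))                       # g_p = 1 / h_p

A = lincomb(h, p)   # sum_k h_k B_p^k
G = lincomb(g, p)   # candidate g-inverse sum_k g_k B_p^k
print(matmul(matmul(A, G), A) == A)
```

Exact rational arithmetic is used so that the defining property $AGA = A$ of a g-inverse is tested without rounding error.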

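$\mathrm{pr}^{\perp}(MZ|MU) = \mathrm{pr}^{\perp}(MU) - \mathrm{pr}(\mathrm{pr}^{\perp}(MU)MZ)$

$= \mathrm{diag}(B_{l_1}, \ldots, B_{l_n}) - \Bigl(\sum_{k=1}^{p} g_k (B_{l_i}, 0) B_p^k (B_{l_j}, 0)'\Bigr)_{i,j=1,2,\ldots,n}$.

Let $O = (O_{ij})_{1 \le i,j \le n} = M'\mathrm{pr}^{\perp}(MZ|MU)M$, and
then we have

$O_{ii} = B_p^{l_i} - \sum_{k=1}^{p} g_k B_p^{\min(k, l_i)}$,

$O_{ij} = -\sum_{k=1}^{p} g_k B_p^{\min(k, l_i, l_j)}$.

We will derive the expectation of $\sum_{k=1}^{p} g_k B_p^{\min(k, l_i)}$,
and the other components could be dealt with by similar arguments. First we
have the decomposition

$\sum_{k=1}^{p} g_k B_p^{\min(k, l_i)} = \sum_{k=1}^{l_i - 1} g_k B_p^k + \Bigl(\sum_{k=l_i}^{p} g_k\Bigr) B_p^{l_i}$

$= \sum_{k=1}^{l_i - 1} \Bigl(\frac{1}{r_k} - \frac{1}{r_{k+1}}\Bigr) B_p^k + \frac{1}{r_{l_i}} B_p^{l_i}$.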

When $k \le l_i$ and $l_i$ is given, we know that $r_k - 1$ follows the
binomial distribution with parameters $n - 1$ and $a_{kp}$. Hence we have

$\mathrm{E}\Bigl(\frac{1}{r_k} \Bigm| l_i, k \le l_i\Bigr) = \sum_{m=0}^{n-1} \frac{1}{m+1} \binom{n-1}{m} a_{kp}^m (1 - a_{kp})^{n-1-m}$

$= \frac{1 - (1 - a_{kp})^n}{n a_{kp}} := b_k$.

Hence we have

$\mathrm{E}\Bigl(\sum_{k=1}^{p} g_k B_p^{\min(k, l_i)} \Bigm| l_i\Bigr) = \sum_{k=1}^{l_i - 1} (b_k - b_{k+1}) B_p^k + b_{l_i} B_p^{l_i}$

$= \sum_{k=1}^{p} [(b_k - b_{k+1}) 1_{[k < l_i]} + b_k 1_{[k = l_i]}] B_p^k$.

Here we adopt the convention $a_{p+1,p} b_{p+1} = 0$ for notational
convenience. Hence

$\mathrm{E}\Bigl(\sum_{k=1}^{p} g_k B_p^{\min(k, l_i)}\Bigr) = \sum_{k=1}^{p} [(b_k - b_{k+1}) a_{k+1,p} + b_k (a_{kp} - a_{k+1,p})] B_p^k$

$= \sum_{k=1}^{p} (a_{kp} b_k - a_{k+1,p} b_{k+1}) B_p^k$

$= \frac{1}{n} \sum_{k=1}^{p} (a_{1k}^n - a_{1,k-1}^n) B_p^k$.

Following this strategy, it is easy to show that

$\mathrm{E}O_{ii} = \sum_{k=1}^{p} [a_k - n^{-1}(a_{1k}^n - a_{1,k-1}^n)] B_p^k$,

$\mathrm{E}O_{ij} = -\frac{1}{n} \sum_{k=1}^{p} (a_k + a_{k+1,p} a_{1k}^{n-1} - a_{kp} a_{1,k-1}^{n-1}) B_p^k$.

Then we have $V = \mathrm{E}O$. □
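
The closed form for $b_k$ used in the proof of Lemma 3 rests on the identity $\mathrm{E}[1/(1+X)] = (1 - (1-q)^n)/(nq)$ for $X$ binomial with parameters $n - 1$ and $q$. The sketch below compares the direct sum with the closed form for a few arbitrary choices of $n$ and $q$; the function names are illustrative.

```python
from math import comb


def expected_inverse(n, q):
    """E[1 / (1 + X)] for X ~ Binomial(n - 1, q), by direct summation."""
    return sum(comb(n - 1, m) * q**m * (1 - q)**(n - 1 - m) / (m + 1)
               for m in range(n))


def closed_form(n, q):
    """Closed form (1 - (1 - q)^n) / (n q), the quantity called b_k in the proof."""
    return (1 - (1 - q)**n) / (n * q)


for n in (2, 5, 16, 24):
    for q in (0.1, 0.5, 1.0):
        print(n, q, expected_inverse(n, q), closed_form(n, q))
```

In the proof, $q$ plays the role of $a_{kp}$, the probability that a subject is still in the study at period $k$.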

Proof of Theorem 2. By definition of symmetric designs we have

(47)  $T_{\sigma d} = (S_{\sigma,d} \otimes I_p) T_d$,

      $F_{\sigma d} = (S_{\sigma,d} \otimes I_p) F_d$,

where $S_{\sigma,d}$ is a permutation matrix for subjects induced by
$\sigma$ and (symmetric) $d$. Note that we have
$(S_{\sigma,d} \otimes I_p)' V (S_{\sigma,d} \otimes I_p) = V$. So
$C_{dij}$, $1 \le i,j \le 2$, are completely symmetric and hence $C_d$ is
completely symmetric for a symmetric design $d$. This yields

$\tilde C_d = C_d$,

and the equality in (10). By (47) we have $\bar T = \bar T S_\sigma$ for any
$\sigma \in \mathcal P$ and hence $\bar T = n^{-1} 1_p 1_t'$. Hence we have
$\hat T = \bar T B_t = 0$. By the same argument we have $\hat F = 0$. Then
the equality in (12) holds, and so does the equality in (13). Hence we
proved $\mathrm{tr}(C_d) = q_d^*$.

Given any design $d$ with corresponding $P = (p_s, s \in \mathcal S)$, we
could define a new design $\bar d = (\bar p_s, s \in \mathcal S)$ by

(48)  $\bar p_s = \sum_{\sigma \in \mathcal P} p_{\sigma s} / |\mathcal P|$.

Then we have
$\sum_{s \in \mathcal S} \bar p_s q_s(x) = \sum_{s \in \mathcal S} p_s q_s(x)$
in view of (36) and $q_{\bar d}(x) = q_d(x)$. □

Proof of Theorem 3. In the following, we apply Lemma 3.1 of Kushner
(1997a) to prove (iii). The proofs of (i) and (ii) follow from similar
arguments. Given any sequence $s$, we have
$q_s(x) = \sum_{k=m}^{p} a_k q_s^k(x)$, where
$q_s^k(x) = q_{s11}^k + 2 q_{s12}^k x + q_{s22}^k x^2$ and
$q_{sij}^k = \mathrm{tr}(G_i' B_p^k G_j)$ with $G_1 = \hat T_u$ and
$G_2 = \hat F_u$. By direct calculation we have

$q_{s11}^k = k - \ell_{s_k}/k$,

$q_{s12}^k = (k\rho_{s_k} + f_{s_k,t_k} - \ell_{s_k})/k$,

$q_{s22}^k = (kt - 1)(k - 1)/(kt) - (\ell_{s_k} - 2 f_{s_k,t_k} + 1)/k$,

where $\ell_{s_k} = \sum_{i=1}^{t} (f_{s_k,i})^2$ and
$\rho_{s_k} = \sum_{i=1}^{k-1} 1_{[t_i = t_{i+1}]}$. For notational
simplicity we define $\ell_k = \ell_{s_k}$, $\rho_k = \rho_{s_k}$ and
$f_k = f_{s_k,t_k}$. Also let $E_A$, $A \subset \{x, k, p, t\}$, denote a
quantity that depends on the elements of $A$, and $a \propto_k b$ means that
$a/b$ is a quantity that only depends on $k$. Then

(49)  $q_s^k(x) \propto_k -\ell_k (x+1)^2 + 2 f_k (x + x^2) + 2k\rho_k x + E_{k,t,x}$

(50)  $= -(\ell_k - 2 f_k)(x+1)^2 + 2k(\rho_k - f_k) x + 2 f_k [(k-1)x - 1] + E_{k,t,x}$.

From (49), for any $x > 0$, the sequence which maximizes $q_s^k(x)$ has to
be of the form
$(1 * 1_{f_{s_k,1}} | 2 * 1_{f_{s_k,2}} | \cdots | (t-1) * 1_{f_{s_k,t-1}} | t * 1_{f_{s_k,t}})$
with the restrictions of $f_{s_k,i+1} \ge f_{s_k,i}$, $i = 1, 2, \ldots, t-1$,
and $f_{s_k,t-1} - f_{s_k,1} \le 1$. For the special case of $k < t$, the
sequence reduces to the form of $(1, 2, \ldots, k-h, t * 1_h)$. By (50) the
sequence of (re) maximizes $q_s(x)$ for any
$x \in ((p-1)^{-1}, (p-2)^{-1}]$ since this sequence maximizes $q_s^k(x)$
for all $k = m, \ldots, p$. Since all the sequences in the class of (re)
have the same value of $d(q_s(x))/dx$, we need to choose $x$ so that the
derivative is zero, and hence (iii) is proven. □

Proof of Theorem 4. By Lemma 4, equations (31)-(35) are equivalent to

(51)  $C_{d11} + x^* C_{d12} B_t = \frac{n y^*}{t-1} B_t$,

(52)  $C_{d21} + x^* C_{d22} B_t = 0$,

(53)  $B(\hat T + x^* \hat F) = 0$.

First we show the necessity. Let $f$ be a symmetric optimal design and $g$
be a new design with $P_g = P_d/2 + P_f/2$. Then by Lemmas 1 and 5 we have

(54)  $C_g \ge C_d/2 + C_f/2 = \frac{n y^*}{t-1} B_t$.

Let $\bar g$ be the symmetrized version of design $g$ as defined by (48).
Following the same argument as in Lemma 1 we have

(55)  $\sum_{\sigma \in \mathcal P} S_\sigma' C_g S_\sigma / |\mathcal P| \le C_{\bar g}$.

Combining (54) and (55) we have

$C_{\bar g} = \frac{n y^*}{t-1} B_t$,

in view of Corollary 2(ii). Then we have $\mathrm{tr}(C_g) = n y^*$, which
together with (54) yields

$C_g = C_d/2 + C_f/2$.

Following similar arguments as in Theorem 5.3 of Kushner (1997b) we have

(56)  $C_{f22} C_{g22}^{+} C_{g21} = C_{f21}$,

(57)  $C_{d22} C_{g22}^{+} C_{g21} = C_{d21}$,

where $G^{+}$ denotes the Moore-Penrose inverse of $G$. Since $f$ is a
symmetric design, we have $C_{f21} = q_{f12} B_t/(t-1)$ and
$C_{f22} = q_{f22} B_t/(t-1) + (1_t' C_{f22} 1_t) J_t/t^2$. So we have
$C_{f22}^{+} = (t-1) B_t/q_{f22} + J_t/(1_t' C_{f22} 1_t)$. By left
multiplying both sides of (56) by $C_{f22}^{+}$ we have

(58)  $C_{g22}^{+} C_{g21} = C_{f22}^{+} C_{f21} = -x^* B_t$.

By plugging (58) into (57) we have (52). Then we have

$\frac{n y^*}{t-1} B_t = C_d = C_{d11} - C_{d12} C_{d22}^{-} C_{d21}$

$= C_{d11} + x^* C_{d12} C_{d22}^{-} C_{d22} B_t$

$= C_{d11} + x^* C_{d12} B_t$.

Hence (51) is derived. From (10) and (5.3) of Kushner (1997b) we have

(59)  $n y^* = \mathrm{tr}(C_d) \le \mathrm{tr}(\tilde C_d) \le \mathrm{tr}(C_{d11} + 2x C_{d12} + x^2 C_{d22})$

      $= q_d(x) - n\,\mathrm{tr}[(\hat T_d + x \hat F_d)' B (\hat T_d + x \hat F_d)]$.

Setting $x = x^*$ in (59) gives
$y^* \le y^* - \mathrm{tr}[(\hat T_d + x^* \hat F_d)' B (\hat T_d + x^* \hat F_d)]$,
which yields (53) due to Pukelsheim [(1993), page 15].

Now we show the sufficiency. By utilizing (51), (52) and (53) we have

$C_{d11} + x^* C_{d12} B_t = \frac{n y^*}{t-1} B_t$,

$C_{d21} + x^* C_{d22} B_t = 0$,

which in turn yields

$C_d = C_{d11} + x^* C_{d12} C_{d22}^{-} C_{d22} B_t = \frac{n y^*}{t-1} B_t$,